Best-Effort Refresh Strategies for Content-Based RSS Feed Aggregation

Authors

  • Roxana Horincar
  • Bernd Amann
  • Thierry Artières
Abstract

Over the past several years, RSS-based content syndication has become a standard technique for disseminating information on the web efficiently and in a timely manner. From a data processing perspective, RSS feeds are standard XML resources that feed aggregators periodically refresh to generate continuous streams of items. In this article, we study the problem of information loss in a content-based feed aggregation system and propose a new best-effort refresh strategy for RSS feeds under limited bandwidth. The strategy is evaluated experimentally and compared with other state-of-the-art crawling strategies for web pages.

Similar resources

Optimizing large collections of continuous content-based RSS aggregation queries

In this article we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS is based on a data-centric approach, using a combination of standard database concepts like declarative query languages, views and multi-query optimization. Users create personalized feeds by defining and composing content-based filterin...

RoSeS: A Continuous Content-Based Query Engine for RSS Feeds

In this article we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS is based on a data-centric approach, using a combination of standard database concepts like declarative query languages, views and multiquery optimization. Users create personalized feeds by defining and composing content-based filtering...

Cobra: Content-based Filtering and Aggregation of Blogs and RSS Feeds

Blogs and RSS feeds are becoming increasingly popular. The blogging site LiveJournal has over 11 million user accounts, and according to one report, over 1.6 million postings are made to blogs every day. The “Blogosphere” is a new hotbed of Internet-based media that represents a shift from mostly static content to dynamic, continuously-updated discussions. The problem is that finding and tracki...

Reliability and Timeliness Analysis of Content-based Publish/subscribe Systems

Content-based publish/subscribe (CBPS) is a simple yet powerful communication paradigm. Its content-centric nature suits a wide spectrum of today’s content-centric applications such as stock market quote exchange, remote monitoring and surveillance, RSS news feeds, and online gaming. As the trend shows that the amount of information along with its producers become astonishingly...

Automatic Content Syndication in Information Science: A Brazilian Experience in the Creation of RSS Feeds to e-journals

This paper reports the partial results of an exploratory study which intends to develop a methodology for a Web feed-based aggregation content service to electronic journals in Information Science. Ten scientific e-journals were chosen as sample to demonstrate the potential of the Web syndication technology. These e-journals are supported by the Brazilian Electronic Journal Publishing System (S...

Publication date: 2010